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Residual feed intake (RFI) is a complex trait that is economically important for livestock 
production; however, the genetic and biological mechanisms regulating RFI are largely 
unknown in pigs. Therefore, the study aimed to identify single nucleotide polymorphisms 
(SNPs), candidate genes and biological pathways involved in regulating RFI using Genome- 
wide association (GWA) and pathway analyses. A total of 596 Yorkshire boars with 
phenotypes for two different measures of RFI (RFI1 and 2) and 60k genotypic data was 
used. GWA analysis was performed using a univariate mixed model and 12 and 7 SNPs 
were found to be significantly associated with RFI1 and RFI2, respectively. Several genes 
such as xin actin-binding repeat-containing protein 2 (XIRP2), tetratricopeptide repeat 
domain 29 (TTC29), suppressor of glucose, autophagy associated 7 (SOGA1), MAS1, G- 
protein-coupled receptor (GPCR) kinase 5 (GRK5), prospero-homeobox protein 7 (PROX1), 
GPCR 155 (GPR155), and FYVE domain containing the 26 (ZFYVE26) were identified 
as putative candidates for RFI based on their genomic location in the vicinity of these 
SNPs. Genes located within 50 kbp of SNPs significantly associated with RFI and RFI2 
(q-value < 0.2) were subsequently used for pathway analyses. These analyses were 
performed by assigning genes to biological pathways and then testing the association 
of individual pathways with RFI using a Fisher's exact test. Metabolic pathway was 
significantly associated with both RFIs. Other biological pathways regulating phagosome, 
tight junctions, olfactory transduction, and insulin secretion were significantly associated 
with both RFI traits when relaxed threshold for cut-off p-value was used (p < 0.05). These 
results implied porcine RFI is regulated by multiple biological mechanisms, although 
the metabolic processes might be the most important. Olfactory transduction pathway 
controlling the perception of feed via smell, insulin pathway controlling food intake might 
be important pathways for RFI. Furthermore, our study revealed key genes and genetic 
variants that control feed efficiency that could potentially be useful for genetic selection of 
more feed efficient pigs. 
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INTRODUCTION 

Residual feed intake is denned as the difference between the 
observed feed intake and the feed intake predicted based on pro- 
duction traits such as average daily gain and backfat thickness. RFI 
is a sensitive and accurate indicator of feed efficiency in livestock 
that is being increasingly accepted as an alternative measure for 
feed efficiency in livestock species. Genetic selection for animals 
with reduced RFI can be advantageous from both economic and 
environmental perspectives (Dekkers and Gilbert, 2010; Cruzen 
etal., 2012; Saintilan etal, 2013). However, genetic variants and 
biological mechanisms regulating RFI need to be identified, which 



Abbreviations: Bp, base pairs; dEBV, deregressed estimated breeding values; EBV, 
estimated breeding values; FDR, false discovery rate; GWA, genome-wide associ- 
ation; MAF, minor allele frequency; Mb, mega base pairs; QTL, quantitative trait 
locus; RFI, residual feed intake; SNP, single nucleotide polymorphism. 



would help to improve genetic selection for this trait. GWA, a 
hypothesis-free approach that uses a large numbers of SNPs spread 
throughout the genome to identify quantitative trait loci (QTL) 
potentially harboring candidate loci, has been widely used to 
explore the genetics underlying complex traits. Past studies have 
led to the identification of many QTLs influencing feed conversion 
ratio (FCR) in pigs 1 . FCR is currently the only available measure 
of feed efficiency that is included in the selection index for the 
Danish pig breeds. However, ratio traits such as FCR are not ideal 
for statistical and biological reasons (Gunsett, 1984) and the accu- 
rate definition for feed efficiency in animals is still being debated. 
Recently, several studies have been conducted to identify QTLs 
and candidate genes putatively influencing RFI in pigs. Using a 



1 http://www.animalgenome.org/cgi-bin/QTLdb/SS/index 
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Pietrain-Large White backcross population, Gilbert etal. (2010) 
identified QTLs on pig chromosomes (SSC) 5 and 9 for RFI in 
growing pigs. In Yorkshire pigs, a GWA study revealed several 
significant SNPs on SSC 2, 3, 5, 7, 8, 9, 14, and 15 influencing 
RFI (Onteru etal, 2013). A candicate gene study performed by 
Fan etal. (2010) validated these SNPs in FTO and TCF7L2 genes 
as genetic markers for RFI in pigs. Recently, Shirali etal. (2013) 
detected novel QTLs for residual energy intake on SSC 2, 4, 7, 8, 
and 14 in a crossed populations (Pietrain grand-sires crossed with 
grand-dams bred from a three-way cross of Leicoma boars with 
Landrace x Large White dams). Sanchez etal. (2014) detected a 
SNP on SSC 6 for RFI in Large White pigs. 

Recently, we have identified significant SNPs on SSC 1, 
9, and 13 for RFI in Duroc pigs (Do etal, 2014). Danish 
Durocs, used as terminal sires in combination with crossbred 
LY sows (Landrace x Yorkshire), are bred with a higher empha- 
sis on growth and feed efficiency traits compared to Yorkshire 
pigs, where the emphasis is considerably more on improv- 
ing litter size. Given these differing emphases on selection, it 
is reasonable that the genetic architeture of these two breeds 
differs with respect to traits like feed efficiency that are target- 
ted more intesively for selection within Durocs. In accordance 
with this, we have found that the genetic variation (heritabil- 
ity) of RFI is higher in Yorkshire compared to Duroc pigs (Do 
etal., 2013a). Therefore, while the biological mechanisms are 
likely conserved even across species (Mayr, 1963; Raff, 1996), 
the genetic regulation of these mechanisms is not necessarily 
conserved, and investigating the genetics underlying the same 
phenotype in a different breed could provide novel insights into 
the biological mechanisms underlying feed efficiency. Compar- 
ing findings of genomic investigations on different breeds that 
have differing linkage disequilibrium (LD) structure could also 
potentially assist in narrowing the boundaries of putative QTL 
regions. 

While GWA studies have been reasonably successful, they 
often focus on a top few significant SNPs while ignoring other 
SNPs with lower significance levels that could still be biologi- 
cally relevant. Gene set enrichment and pathway analyses using 
publicly available biological databases could potentially comple- 
ment efforts to identify causal loci for complex traits, as has been 
shown in previous studies (Kadarmideen, 2008; Torkamani etal., 
2008; Wang etal., 2010). These approaches, instead of relying 
solely on statistically associated genetic variants, focus on bio- 
logical pathways that are mediated by genes located in the vicinity 
of these variants. Such approaches have been shown to provide 
valuable insights into the biology underlying complex pheno- 
types (Kadarmideen etal, 2006; Farber, 2013; Kadarmideen, 
2014). Therefore, the objective of our study was to use both 
GWA and pathway analyses to identify SNPs, genes, and biolog- 
ical pathways that could potentially influence RFI in Yorkshire 
pigs. 

MATERIALS AND METHODS 

ESTIMATION OF RESIDUAL FEED INTAKE AND DEREGRESSED 
ESTIMATED BREEDING VALUES 

Data were recorded during a 5-year period (2008-2012) and sup- 
plied by the Pig Research Centre of the Danish Agriculture and 



Food Council. A total of 596 Yorkshire pigs had both phenotypic 
(RFI) and genotypic records (based on PorcineSNP60 Illumina 
iSelect BeadChip). The method of calculation of RFI has been 
previously discussed in detail (Do etal, 2013a). In summary, 
RFI was computed as the difference between the observed aver- 
age daily feed intake and the predicted daily feed intake using 
two statistical models. In the first model (RFI1), predicted daily 
feed intake was estimated using linear regression of daily feed 
intake on initial test weight (BWd) and average daily gain from 
30 to 100 kg, whereas in the second model (RFI2), backfat was 
used as an additional regressor. The EBVs for RFI were calcu- 
lated using a univariate animal model where barn-year-season 
were used as fixed effects and the effect of pen and the additive 
genetic effect were treated as random effects. The pedigree was 
traced back to January, 1971 and included 14,681 pigs with 1951 
sires, 6766 dams. These EBVs were further deregressed as previ- 
ously described (Ostersen etal., 2011; Do etal., 2013b), following 
the deregression procedure of Garrick etal. (2009). This proce- 
dure adjusts for ancestral information, so that the deregressed 
EBV (dEBVs) only includes information from individual animals 
and their descendants. Since our resource population consists of 
5337 pigs of which only 1564 pigs had genotypic records, the use 
of deregressed proofs was intended to maximize use of phenotypic 
information from non-genotyped pigs. Because the dEBVs have 
unequal variances, they should be used in a weighted analysis. The 
weight for the zth animal was estimated as: 

(1 : h 2 ) 
[(c+(a-rf)/rf))h 2 ] 

in which c was the part of the genetic variance that was assumed 
to be not explained by markers (c = 0.1), h 2 was the heritability of 
the trait, and rf was the reliability of the dEBV of the rth animal. 

GEN0TYPING AND DATA QUALITY CONTROL 

The details of the resource population used and DNA collection 
were described in Henryon et al. (2001). For genotyping, genomic 
DNA was isolated from tissue by treatment with proteinase K fol- 
lowed by sodium chloride precipitation and SNPs were genotyped 
on the PorcineSNP60 Illumina iSelect BeadChip. Data quality 
control prior to GWA analyses was implemented by discarding ani- 
mals and SNPs with a call rate <0.95, SNPs deviating from Hardy 
Weinberg equilibrium {p < 0.0001) and SNPs with a MAF < 0.05. 

LINEAR MIXED MODEL USED FOR GENOME WIDE ASSOCIATION 
ANALYSES 

A univariate linear mixed model was implemented to test the 
association between each SNP and RFI. The model was similar 
with previous GWA analysis in Duroc pigs (Do etal, 2014). In 
summary, the model for each SNP (analyzed individually) was as 
follows: 

y = l(i + Za + mg + e 

where y is the vector of dEBVs for RFI, lis a vector of Is with 
length equal to number of observations, u, is the general mean, Z 
is an incidence matrix relating phenotypes to the corresponding 
random polygenic effect, a is a vector of the random polygenic 
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effect ~N(0, Aa 2 ) where A is the additive relationship matrix and 
a 2 u is the polygenic variance, m is a vector with genotypic indi- 
cators (—1, 0, or 1) associating records to the marker effect, g 
is a scalar of the associated additive effect of the SNP, and e is a 
vector of random environmental deviates: N(0, W~ 1 o 2 .) where a 2 
is the general error variance and W is the diagonal matrix con- 
taining weights of the dEBVs. The model was fitted by restricted 
maximum likelihood (REML) using the DMU software (Madsen 
et al., 2006) and testing was done using a Wald test against a null 
hypothesis of g = 0. The Wald test was based on a £-distribution 
and regression coefficients and SEs were obtained by solving lin- 
ear mixed model equations using DMU (Madsen etal., 2006). 
This test was done by the f-distribution function "ptQ" in R with 
p = 2*pt[abs(p7S£), (n - 3), log = FALSE, lower.tail = FALSE; 
where p" is regression coefficient, SE is standard error estimated 
based on the inverse of the mixed model coefficient matrix and 
n is number of SNPs in the genotype data] . Bonferroni corrected 
significance threshold, used to account for multiple comparisons, 
was estimated at a p = 1.31e— 06. However, Bonferroni correc- 
tion is known to be overly conservative especially when genetic 
data exhibits high LD, which could produce false negative results 
(Duggal etal, 2008). Therefore, in our analyses we considered a 
less conservative significance threshold of a p = le— 04 in order 
to account for multiple tests. The p was chosen here based on a 
Bonferroni adjustment only for the number of independent tests 
that was in turn inferred by the number of principal compo- 
nents accounting for a 99% of the variance of the SNP matrix 
(Gao etal., 2008). Moreover, to further characterize candidate 
regions affecting RFI, we performed LD block analyses for the 
chromosomal regions with multiple (or the most) significant SNPs 
clustered. The blocks were defined based on criteria suggested by 
Gabriel etal. (2002) which implemented in Haploview (Barrett 
etal, 2005). 

PATHWAY ANALYSES 
Assignment of genes to SNPs 

Assigning genes close to a few SNPs with high statistical signif- 
icance, and ignoring many SNPs with lower significance levels 
could result in missing out on key putative candidates and asso- 
ciated pathways. Hence, we used the procedure of controlling 
false discovery rate (FDR; Benjamini and Hochberg, 1995) to 
select SNPs for pathways analyses. All SNPs with a FDR (or 
q-value) <0.2 were used to identify putative candidate genes. 
Based on previous studies, we also included pathway annota- 
tions associated with genes within 50 kb of SNPs associated 
with RFI at a nominal significance thresold of 0.05, in path- 
way analyses (Choquette etal., 2012; Penagaricano etal., 2013). 
Positional candidate genes, located within 50 kb of these SNPs, 
were identified using function GetNeighGenesQ in the NCBI2R 
package 2 for R program (R Development Core Team, 2008). The 
distance of 50 kb was used in order to capture proximal regula- 
tory and other functional regions close to the gene. Moreover, 
several studies showed that the average LD was high in pigs. 
The average distance between adjacent SNP pairs (with average 
r 2 = 0.5) was around 72 kb in the current population (Wang 



2 http://cran.r-project.org/web/packages/NCBI2R/ index.html 



etal, 2013). Therefore, the distance of 50 kb was suitable to 
capture the causal genes/SNPs. Individually, each gene was con- 
sidered to be significantly associated with RFI if a SNP with a 
q-value <0.2 (as well as for relxed thresold at a p < 0.05) was 
located either inside the genie region or within 50 kb of the 
genes. 

Assignment of genes to Kyoto Encyclopedia of Genes and Genomes 
(KEGG) pathways and pathway analyses 

For functional annotation, Kyoto Encyclopedia of Genes and 
Genomes 3 (KEGG) was used for getting pathways. To assign genes 
to pathways, we used the function GetPathwaysQ in NCBI2R 
package; and to get number and names of genes in each path- 
way, the mapPathwaytonameO function 4 was used. Because RFI 
was production trait in pigs, the assigned pathways belonging to 
human diseases and drug development categories were removed. 
To determine whether a pathway term was significantly associated 
with RFI, we tested if genes significantly associated with RFI were 
overrepresented amongst all the genes of any given pathway. This 
association analysis was performed using a Fisher's exact test via 
the fisher. test() function in R. 

RESULTS 

GENOME-WIDE ASSOCIATION ANALYSES 

After data quality control, a total of 37,192 SNPs and 596 
pigs remained in the final dataset for GWA analyses. Eleven 
SNPs were significantly associated with RFI1 (Figure 1), 
and seven were associated with RFI2 (Figure 2); two SNPs 
(MARC0027992 and ASGA0039145) being associated with both 
traits [p < le— 04 (0.0001)]. The most significant associations 
were found between ASGA0039145 and RFI1 (p = 7.76e-06) and 
between MARC0027992 and RFI2 (p = 2.47e-05). Significant 
SNPs associated with RFI1 were located on SSC 3, 7, 8, 9, 15, and 
17 meanwhile those for RFI2 were located in SSC 1, 7, 8, 10, 14, and 
15 (Table 1). A search for genes located in the immediate 50 kbp 
vicinity of these SNPs revealed XIRP2, TTC29, SOGA1, MAS1, 
GRK5, PROX1, GPR155, and ZFYVE26 as putative candidates. 
Two LD blocks were detected on a candidate region (86-88 Mb on 
SSC8) which associated with both RFI (Figure 3). 

PATHWAY-BASED ASSOCIATION ANALYSES 

Based on results from GWA analyses, a total of 402 SNPs asso- 
ciated with RFI1 and another 304 SNPs associated with RFI2 
(based on a FDR threshold of q-value <0.2) were used to locate 
339 and 304 genes, respectively, within 50 kb of these SNPs 
for pathway analyses. A total of 21,296 genes in 50 kb flank- 
ing regions of SNPs which passed QC was used as background 
for enrichment test. Pathway analysis tests for KEGG pathways 
revealed a metabolic pathway to be associated with both RFI1 
(p = 0.008) and RFI2 (p = 0.002); and an additional olfac- 
tory transduction pathway only associated with RFI2 (p = 0.03). 
Repeating pathway analyses after relaxing the significance thresh- 
old to p < 0.05, revealed 15 and 12 pathways associated with RFI1 
and RFI2, respectively (Table SI in Supplementary Materials), of 



3 http://www.genome.jp/kegg/ 
4 http://biobeat.wordpress.com/category/r/ 
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FIGURE 1 | Manhattan plot of genome-wide p of association for residual feed intake 1 (RFI).The horizontal red and blue line represents the Bonferroni 
(p = 1.31 e-06) and genome-wide significance threshold (p < 1e— 04), respectively. 




FIGURE 2 | Manhattan plot of genome-wide p of association for RFI 2. The horizontal red and blue line represents the Bonferroni (p = 1.31e-06) and 
genome-wide significance threshold (p < 1e-04), respectively. 



which nine pathways were commonly associated with both RFI 
phenotypes. The pathways associated with both RFI phenotypes 
included metabolic pathway, olfactory transduction, tight junc- 
tion and phagosome pathway that were associated with both RFI 
traits atp < 0.01. 

DISCUSSION 

The statistical and bioinformatics methods used in the cur- 
rent study are similar to those applied in a separate study 
aiming to identify QTLs influencing RFI in Duroc pigs (Do 
etal., 2014). However, in order to account for multiple com- 
parisons, we used a less stringent genome-wide significance 
threshold (p < 0.0001) that adjusts the nominal significance 
threshold only for the number of independent tests that are 
possible given a particular dataset. The Bonferroni correction 
that was used in the previous study (Do etal., 2014), is known 



to overcompensate for multiple testing, especially when applied 
to correlated data. As such, the use of a relaxed threshold 
was expected to decrease the number of false negatives thereby 
increasing power. Several studies have applied pathway analy- 
sis on GWA datasets and reported pathways and GOs associated 
with backfat thickness (Fontanesi etal., 2012), feeding behavior 
(Do etal, 2013b), and RFI (Do etal, 2014) in pigs. However, 
these studies focused on genes in close proximity to signifi- 
cantly associated SNPs based on stringent genome-wide thresholds 
like Bonferroni, while ignoring many SNPs with lower signifi- 
cance levels. Therefore, we used a lower significance threshold 
including all genes located near SNPs associated with RFI at a 
q-value <0.2 in the current study. (Penagaricano etal., 2013) 
also used more relaxed threshold with p < 0.05 and detected 
many significant pathways and network from GWA data for 
bull fertility traits. Moreover, since the porcine genome contains 
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Table 1 I Genome-wide significant SIMPs for residual feed intake in Yorkshire. 
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1 RFI1: residual feed intake 1; RFI2: residual feed intake 2. 

2 Pig chromosome. 

3 Based on Sscrofa10.2 (GCA_000003025.4) at http://www.ensembl.org/Sus_scrofa/lnfo/lndex 



many uncharacterized genes, widely used annotation servers 
like DAVID 5 and GOEAST 6 are of limited use, as they per- 
form poorly in converting porcine gene IDs. Therefore, we 
performed functional annotation of genes by directly query- 
ing KEGG databases that are better able to handle porcine 
gene IDs. 

Comparing results from our previous study targeting a Duroc 
resource population almost twice in size, we were able to iden- 
tify overlapping QTLs at approximately 87 Mb on SSC 8, and 
at approximately 136 Mb on SSC 15 (Table 1; Do etal, 2014). 
The proportionally small number of overlapping QTLs is in 
agreement with other studies that have investigated QTLs for 
a specific trait within different pig breeds (Gregersen etal., 
2012). Furthermore, despite of substantial differences in the 
location of QTL regions between the two breeds, pathway anal- 
yses identified many pathways (e.g., insulin regulation related 
pathways and cellular communication pathways) that were also 
identified in our previous study with Duroc. Taken together, 
these observations reaffirm the notion that while the biolog- 
ical mechanisms underlying a particular phenotype do not 



5 http://david. abcc.ncifcrf.gov/ 
6 http://omicslab.genetics. ac.cn/GOEAST/ 



differ, the genetic regulation of these mechanisms can differ 
between breeds. This is particularly important in the context 
of developing marker-assisted and genomic selection strate- 
gies, as it demonstrates that improving the same trait may 
require different sets of markers for different lines/breeds of 
livestock. 

CANDIDATE GENES FOR RESIDUAL FEED INTAKE 

Significant associations between both RFI and SNPs on SSC 7 
(MARC0027992) and8 (ALGA0039145) were foundin the present 
study. Here, ENSSSCG00000018237 encoding Ul spliceosomal 
RNA was found near a SNP significantly associated with both 
RFIs on SSC 7. Ul spliceosomal RNA constitutes Ul small nuclear 
ribonucleoprotein that plays a role in splicing of pre-mRNAs 
(Zwieb, 1996). Another independent study (Onteru etal., 2013) 
reported a QTL region between 16 and 17 Mb on SSC 7 in York- 
shire pigs for RFI. In this study, a SNP significantly associated 
with RFI1 was found within an intron 4-5 of zinc finger, FYVE 
domain containing the 26 (ZFYVE26) gene that is also located on 
SSC 7. The gene encodes a protein containing a FYVE zinc fin- 
ger binding domain which helps target the protein to membrane 
lipids (Laity etal, 2001). While ZFYVE26 has been associated 
with autosomal recessive spastic paraplegia in humans (Herd et al., 
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FIGURE 3 | Linkage disequilibrium (LD) pattern on a region from 86 to 88 Mb on pig chromosome 8. LD blocks are marked with triangles. Values in boxes 
are LD (f 2 ) between SNP pairs and the boxes are red indicated LOD > 2 and D' = 1 (LOD is the log of the likelihood odds ratio, a measure of confidence in the 
value of D'). 



2004), the precise biological function of this gene has not yet been 
described. 

Another important association was between SNP ASGA0039 145 
on SSC 8 and both RFI traits. This SNP is located in a genomic 
region where QTLs influencing FCR have been reported in an 
independent study based on a Duroc resource population (Sahana 
etal, 2013). The ENSSSCG00000009034 is a gene closest to this 
SNP; however, the gene has not been functionally characterized yet. 
Moreover, this SNP is tightly linked with five other SNPs to form 
the LD block 1 (Figure 3). This LD block spans 487 kb region and 
consists three significant SNPs for RFI1. This LD block also covers 
a TTC29 gene which encodes a testis development protein that 
could also be an interesting candidate for further investigation. 
Also known as NYD-SP14, TTC29 is a component of axonemal 
dyneins (Yamamoto et al., 2008) that have recently been demon- 
strated to play an important role in fat metabolism (Sohle et al., 
2012). 

We also found many SNPs to be associated with either RFI1 
or RFI2 in our analyses. On SSC3, the SNP H3GA0010038 
associated with only RFI1 located closest to a novel gene. 
On SSC 9, another transcriptional factor PROX1 was found 
proximal to an SNP exclusively associated with RFI1. PROX1 
likely plays a fundamental role in the early development of 
the central nervous system (Kaltezioti etal., 2010). It also is 
a key regulator of lymphatic endothelial cell fate specification 
(Ma, 2007), ERRa mediated control of the molecular clock 
(Choquette etal, 2012), and modulation of insulin sensitivity 
and glucose handling (Fontanesi etal, 2012) that could influ- 
ence energy metabolism and RFI. On SSC17, MARC0067053 
is significantly associated with RFI1 (p = 1.08e— 04), and 
is located in 5UTR' region of SOGA1 gene. This gene 
encodes a SOGA1 protein that plays a role in reducing glu- 
cose production (Cowerd etal, 2010). It also contributes to 



adiponectin-mediated insulin-dependent inhibition of autophagy 
(Forbes, 2010). Since autophagy provides biochemical intermedi- 
ates for glucose production (Forbes, 2010) which influences feed 
consumption, SOGA1 could be an interesting candidate gene for 
RFI1. 

Moreover, on SSC 1 and SSC 14, close to significant SNPs, 
we reported two members of G-protein couple receptor {MAS1 
and GRK5) as possible candidate genes for RFI2. For instance, 
the GRK5 regulates the GPCR signaling pathway and GRK5 
deficiency led to insulin resistance and hepatic steatosis, or 
decreases diet-induced obesity and adipogenesis in mice (Wang 
et al., 2012b). On chromosome 10, a functionally uncharacterized 
gene ENSSSCG00000011017 encoding a lysozyme-like ortholog, 
was located near ALGA0117721 that was significantly associated 
with RFI2. Other candidate genes in proximity to SNP associated 
exclusively with RFI1 or RFI2 were RPS18, GPR155, and XIRP2. 
Dysregulation of GPCR 155 (GPR155) is associated with higher 
feed efficiency in chicken while RPS18, encoding the 40S riboso- 
mal protein S18, is involved in regulation of development (Laity 
etal, 2001). However, very little is known about the biological 
function of these genes and their relevance in the context of RFI is 
not apparent. 

Notably, melanocortin 4 receptor (MC4R; SSC 1: 178,553,488- 
178,555,219) is perhaps most well-known candidate gene for feed 
efficiency or/ and feed intake in pigs (Kim etal., 2000; Houston 
etal, 2004; Burgos etal, 2006; Davoli etal, 2012; Onteru etal, 
2013). The MC4R gene codes for a G protein transmembrane 
receptor playing an important role in energy homeostasis control. 
In pigs, a SNP (missense substitution 298 Asp > Asn) in MC4R 
gene has been identified and associated to average daily gain, feed 
intake and fatness traits in many difference studies (Kim etal., 
2000; Houston etal, 2004; Fan etal, 2009; Davoli etal, 2012). 
Moreover, Leptin (SSC 18: 21, 201, 786-21, 204, 671) plays a key 
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role in regulating energy intake and expenditure and is a candidate 
gene for feed efficiency in pigs (Barb etal., 2001). However, vari- 
ants in these genes are not included the PorcineSNP60 Illumina 
iSelect BeadChip and it is unclear whether such variants could 
influence RFI in the current population. 

STATISTICAL METHODS USED IN GWAS 

One of the challenges for doing GWAS in livestock population is 
the large proportion of animals have phenotypic but no geno- 
typic records, especially in dairy cattle. Recently, Wang etal. 
(2012a) proposed a single step GBLUP (or ssGBLUP) method 
that incorporates all genotypes, observed phenotypes and pedi- 
gree information jointly in one step and provides GEBVs for all 
animals with or without genomic data or phenotypic data or both 
(based on the methods of Aguilar etal, 2010; Christensen and 
Lund, 2010). The ssGWAS is a method based on GBLUP (Wang 
etal., 2014) which derives the SNP effects (or SNP variance) from 
GEBVs calculated from ssGBLUP. However, the use of the ssGWAS 
method is limited by finding appropriate number of iterations 
required to get marker solutions and most importantly, its inabil- 
ity to determine the genome-wide significance level for each SNP 
in the entire genomic data. However, our approach in this study 
of combining GWAS with pathways analysis requires a genome- 
wide adjusted p-value for each SNP for selecting the top SNPs 
for further downstream gene enrichment and pathway analyses 
using bioinformatics tools. Hence the genome-wide p-values for 
each SNP are not possible in ssGWAS method, we have adapted 
a mixed model GWAS and implemented using DMU package 
(Madsen etal., 2006). We used deregressed EBVs as a pseudo- 
phenotype but ssGBLUP would have also been a better choice in 
that it handles variability far better than the use of deregressed 
EBV as Wang etal. (2012a) reported differences in accuracy of 
genomic breeding values compared to other methods including 
classical GWAS. 

There is still some controversy as to how to properly determine 
SEs of estimated SNP effects by GBLUP based-methods. Recently, 
Gualdron Duarte et al. (2014) provided a way to determine signif- 
icance values for each SNP marker effect by linear transformations 
of genomic evaluations. Briefly, the likelihood ratio is calculated 
to test the significance of the largest effect segment of each chro- 
mosome by comparing against a reduced model with fixed effects 
and GEBVs. The critical value (size of the test) is adjusted by the 
Bonferroni correction. Moreover, we also would like to note that 
our GWA model could be extended to the case where the addi- 
tive genetic relationship is substituted by the genomic relationship 
matrix like in EMMAX 7 . 

PATHWAYS INVOLVED IN RESIDUAL FEED INTAKE 

Results from gene-set enrichment analyses are largely dependent 
on how gene-sets are identified or defined. In the current study, 
our gene-set was determined by the significance threshold that 
was used to declare SNPs significantly associated with RFI. Con- 
sequently, our enrichment analyses was very dependent on our 
choice of significance threshold. The choice of significance thresh- 
old also influences the degree of confidence that can be ascribed 



7 http://genetics. cs.ucla.edu/emmax/ 



to results from gene-set enrichment analyses. Choosing a strin- 
gent threshold like Bonferroni will likely yield very few results 
with higher confidence as opposed to a lenient threshold that will 
likely yield more results with lower confidence. In our analyses, 
we decided to use a FDR based q- value threshold of 0.2 to balance 
the number of results and the degree of confidence associated with 
them. Applying more stringent FDR thresholds (for e.g., of 0.05 or 
0.10) significantly reduced the numbers of SNP, and consequently 
the number of genes in the gene-set, for pathway annotation. 
Therefore, by setting the at a q- value <0.2 (p ~ 0.001; mean- 
ing 20% of SNPs using for pathway analyses are likely to be false), 
we had a reasonable number of SNPs for gene-set enrichment 
analyses. 

Regardless to different thresholds, the metabolic pathway 8 
was significantly associated with both RFI traits. Many previous 
studies have shown that variation in mediators of metabolic pro- 
cesses contribute to the variation of RFI (e.g., reviewed in Herd 
etal, 2004; Herd and Arthur, 2009; Hoque and Suzuki, 2009; 
Dekkers and Gilbert, 2010). However, the metabolic pathway is 
a broad overarching term that contains many specific modules 
(e.g., energy, carbohydrate and lipid, nucleotide and amino acid 
and secondary metabolism). So future investigations evaluating 
the contribution of specific sub-modules within this pathway to 
the genetic variation in RFI might be warranted. An interesting 
pathway associated with RFI2 (and with RFI1 when analysis is 
performed based on a nominal threshold of a p < 0.05) was 
related to olfactory transduction. Olfactory transduction path- 
ways are responsible for the perception of odor via olfactory 
receptors and downstream biochemical signaling events that ulti- 
mately get transformed into electrical impulses sent to the brain 
(Ma, 2007). Pigs have the largest repertoire of functional olfac- 
tory receptors (Groenen etal., 2012) encoded by at least 1113 
genes (Nguyen etal., 2012), 14 and 11 of which are located near 
SNPs significantly associated with RFI1 and RFI2, respectively, 
(Table SI in Supplementary Materials). Olfaction is one of the 
major sensory modalities that contribute to hedonic evaluation 
of a food, resulting in food choice and its possible consump- 
tion. It is modulated in response to changing levels of various 
molecules, such as ghrelin, orexins, neuropeptide Y, insulin, lep- 
tin, and cholecystokinin (Palouzier-Paulignan etal., 2012). These 
molecules are known to play an important role in controlling 
of RFI. For example, genetic selection for low and high RFI in 
pigs has been shown to change leptin concentration in plasma 
(Lefaucheur etal, 2011). Lower RFI has also been shown to be 
associated with lower serum leptin concentrations in Duroc pigs 
(Hoque etal, 2009). 

Another interesting pathway significantly associated with both 
RFI traits (when using a nominal threshold of a p < 0.05) was the 
insulin secretion pathway (Table SI in Supplementary Materials). 
Do etal. (2014) also reported Insulin signaling pathway associ- 
ated with RFI in Duroc pigs. Insulin-dependent regulation of feed 
intake has been described in many species including cattle (Chen 
etal, 2012; Rolf etal, 2012) and pigs (Colditz, 2004; Cruzen etal, 
2012). The results imply that insulin secretion is possibly an inter- 
mediary pathway by which olfactory transduction influences RFI. 



http://www.genome.jp/kegg-bin/show_pathway?org_name— ssc&mapno=01100 
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Richardson and Herd (2004) indicated that differences in some 
plasma metabolites and hormones have been positively related to 
genetic and phenotypic measures of RFI in ruminants. 

The genetic variants identified by GWA studies may facili- 
tate the incorporation of marker-assisted selection in commercial 
breeding schemes for improvement of complex traits. Moreover, 
Snelling etal. (2013) and Kadarmideen (2014) have suggested 
that genomic selection could perform better if it is guided by 
network and pathway analysis. Biological pathways identified by 
post-GWA analyses could further our current understanding of 
the genetic underlying different complex traits. Therefore, our 
results would be of interest not only to breeders interested in 
using marker-assisted selection to improve feed efficiency in pigs, 
but also to biologists interested in better understanding the bio- 
logical mechanisms influencing feed efficiency. However, it is 
also important to consider potential limitations of our study, 
such as the limited size of Yorkshire resource population, statis- 
tical model used in the estimation of RFI, and statistical models 
used in GWA; gene set enrichment and pathway analyses. Finally, 
it is also important to note that all results reported in this 
study are only relevant to the specific definition used in this 
study. 

In summary, the present study describes SNPs, candidate genes 
and biological pathways putatively influencing RFI in Yorkshire 
pigs. Important candidate genes such as XIRP2, TTC29, SOGA1, 
MAS1, GRK5, PROX1, GPR155, and ZFYVE26 were identified here 
that could be further investigated for harboring causal variants. 
Pathway analyses identified metabolic and olfactory transduction 
pathways to be associated with RFI. Many other pathways (such 
as insulin secretion, tight junction) that were found to be associ- 
ated with RFI based on a lenient nominal significance threshold 
might be of some interest. However, more studies are required to 
determine their role in regulating RFI. 
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